De-duplicating backup tool on a block basis? [closed]

Posted by SST on Server Fault
Published on 2012-10-11T14:48:23Z Indexed on 2012/10/11 15:38 UTC

I am looking for a backup tool (ideally free as in speech or beer) for Unix-like operating systems that can store deduplicated backups, i.e. only non-redundant content takes up additional space.

I already looked at dirvish (my first candidate) and rsnapshot, which use hardlinks to achieve deduplication on a per-file level. However, I want to back up large files (Thunderbird mailboxes >3GB, VMware images >10GB), and with per-file deduplication such files are stored again in their entirety even if just a few bytes change.

Then there are rsync-based tools like rdiff-backup, which store only deltas and a current mirror. However, as the deltas are generated against each previous mirror, it is difficult to fine-tune the retention granularity (only keep one backup after a week, etc.) because the deltas would have to be re-evaluated.

Another approach is to partition content into blocks and store each block only if it is not stored yet, otherwise just linking it to the first occurrence. The only tool I know of so far that does this is obnam (http://liw.fi/obnam), and it even supports zlib compression and gpg encryption -- nice! But it is very slow, AFAICT.
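
To make the block-based approach concrete, here is a minimal sketch in Python. It is not how obnam or any other tool actually implements it; the class name BlockStore, the 4096-byte BLOCK_SIZE and the in-memory dictionary are all assumptions made for illustration. Each block is keyed by its SHA-256 digest, and a backup generation is just the ordered list of digests needed to rebuild the file.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for this sketch


class BlockStore:
    """Toy content-addressed store: each distinct block is kept exactly once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put(self, data):
        """Split data into fixed-size blocks, store the new ones,
        and return the list of digests ("recipe") for this generation."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Only blocks that are not stored yet take up additional space.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from a recipe."""
        return b"".join(self.blocks[digest] for digest in recipe)


# Stand-in for a large file: 8 blocks with distinct content.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))

# The same file after a single byte changed inside the second block.
modified = bytearray(original)
modified[5000] = 0xFF
modified = bytes(modified)

store = BlockStore()
generation_1 = store.put(original)
blocks_after_first = len(store.blocks)
generation_2 = store.put(modified)
blocks_after_second = len(store.blocks)

print(blocks_after_first, blocks_after_second)
```

In this sketch the second generation adds only the one block that changed, and deleting a generation simply means dropping its recipe and any blocks no other recipe refers to, so no deltas need to be re-evaluated. Note that fixed-size blocks only deduplicate well for in-place changes; an insertion shifts every later block boundary, which is why tools of this kind often use content-defined chunking instead.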

Does anyone know of any other solid backup software that supports deduplication on a sub-file level, ideally with at least some management options (show/select/delete generations...)?

© Server Fault or respective owner
